feat: add LiveKit provider (Phase 1 - data channel)#3

Open
Mohith1612 wants to merge 3 commits into ArchishmanSengupta:main from Mohith1612:main

@Mohith1612

What this adds

A new livekit provider that runs adversarial eval conversations over LiveKit data channel messages, no audio required. This lets you run the full autovoiceevals optimization loop against any agent deployed on a LiveKit room.

How it works

The caller bot joins a uniquely-named LiveKit room and exchanges turns as JSON data messages on a configurable topic (default: "text"):

  • Caller → Agent: {"role": "user", "content": "<turn>"}
  • Agent → Caller: {"role": "assistant", "content": "<reply>"} (plain text also accepted)
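For illustration, the wire format above could be handled with two small helpers. This is only a sketch: the names `encode_turn` and `decode_reply` are hypothetical and are not taken from `livekit_provider.py`.

```python
import json


def encode_turn(content: str, role: str = "user") -> bytes:
    """Serialize one caller turn as a UTF-8 JSON data message."""
    return json.dumps({"role": role, "content": content}).encode("utf-8")


def decode_reply(payload: bytes) -> str:
    """Extract the agent's reply text from a data message.

    Accepts the JSON shape {"role": "assistant", "content": "..."} and
    falls back to treating the payload as plain text.
    """
    text = payload.decode("utf-8", errors="replace")
    try:
        message = json.loads(text)
    except json.JSONDecodeError:
        return text  # plain-text reply
    if isinstance(message, dict) and isinstance(message.get("content"), str):
        return message["content"]
    return text  # valid JSON, but not the expected shape
```

Falling back to the raw text keeps the caller tolerant of agents that reply without the JSON envelope.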

Rooms are named {room_prefix}-{scenario_id}-{uuid4} to prevent cross-talk between parallel evals.
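A minimal sketch of that naming scheme, assuming a hypothetical helper `make_room_name` (the actual function name in the provider may differ):

```python
import uuid


def make_room_name(room_prefix: str, scenario_id: str) -> str:
    """Build a room name of the form {room_prefix}-{scenario_id}-{uuid4}."""
    # uuid4 is random, so two evals of the same scenario get different rooms.
    return f"{room_prefix}-{scenario_id}-{uuid.uuid4()}"
```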

Files changed

| File | Change |
| --- | --- |
| autovoiceevals/livekit_provider.py | New LiveKitClient (264 lines) |
| autovoiceevals/config.py | New LiveKitConfig dataclass |
| autovoiceevals/pipeline.py | Wire livekit in _build_provider() |
| autovoiceevals/researcher.py | Wire livekit in _build_provider() |
| examples/livekit.config.yaml | Example config |
| .env.example | LiveKit env var template |
| requirements.txt | Optional dep comment (livekit>=1.0.0) |
| README.md | Document LiveKit in providers, setup, structure |

Setup

Install the optional dependency:

pip install "livekit>=1.0.0"

Set the environment variables (template in .env.example):

LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your-key
LIVEKIT_API_SECRET=your-secret

Configure the provider (see examples/livekit.config.yaml):

provider: livekit
livekit:
  url: "wss://your-project.livekit.cloud"
  room_prefix: "eval"
  data_topic: "text"
  response_timeout: 30
  agent_join_timeout: 30
  agent_backend: "none"
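As a rough sketch of how the `livekit:` block might map onto the `LiveKitConfig` dataclass: the field names mirror the config keys above, but the defaults (other than `data_topic`), the `from_dict` helper, and the `LIVEKIT_URL` fallback are assumptions, not the PR's actual code.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class LiveKitConfig:
    # Defaults are assumed from the example values, not confirmed by the PR.
    url: str = ""
    room_prefix: str = "eval"
    data_topic: str = "text"
    response_timeout: float = 30.0
    agent_join_timeout: float = 30.0
    agent_backend: str = "none"

    @classmethod
    def from_dict(cls, raw: dict) -> "LiveKitConfig":
        """Build a config from the parsed `livekit:` mapping (hypothetical helper)."""
        known = {f.name for f in fields(cls)}
        unknown = set(raw) - known
        if unknown:
            raise ValueError(f"unknown livekit config keys: {sorted(unknown)}")
        cfg = cls(**raw)
        if not cfg.url:
            # Assumed fallback to the environment variable shown above.
            cfg.url = os.environ.get("LIVEKIT_URL", "")
        return cfg
```

Rejecting unknown keys catches typos such as `data_topics` early instead of silently using the default.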

Prompt management

  • agent_backend: "none": use the LiveKit provider for conversations only and manage prompts externally (prompt read/write raises NotImplementedError)
  • agent_backend: "smallest": delegate get_system_prompt / update_prompt to SmallestClient
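The two modes could be dispatched roughly as follows. This is a sketch only: `NoPromptBackend` and `build_prompt_backend` are hypothetical names, and the `smallest_client` argument stands in for a SmallestClient instance whose constructor is not shown here.

```python
class NoPromptBackend:
    """Backend for agent_backend: "none"; prompt access is unsupported."""

    def get_system_prompt(self) -> str:
        raise NotImplementedError("No agent_backend configured for LiveKit provider")

    def update_prompt(self, prompt: str) -> None:
        raise NotImplementedError("No agent_backend configured for LiveKit provider")


def build_prompt_backend(agent_backend: str, smallest_client=None):
    """Return the object that serves get_system_prompt / update_prompt."""
    if agent_backend == "none":
        return NoPromptBackend()
    if agent_backend == "smallest":
        if smallest_client is None:
            raise ValueError('agent_backend "smallest" needs a client instance')
        # Delegate directly: the client is assumed to expose both methods.
        return smallest_client
    raise ValueError(f"unknown agent_backend: {agent_backend!r}")
```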

Why "Phase 1"?

This implements text-only evals. A future Phase 2 would support audio (microphone/speaker tracks). Phase 1 is already useful for any agent that processes text data messages, for example one built with the LiveKit Agents SDK that registers a data-channel handler.

Bug fix included

Fixed a TypeError in run_conversation caused by a signature mismatch when called from pipeline.py / researcher.py with scenario and dynamic_variables kwargs.
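To illustrate the kind of mismatch described, here is a reduced example. Both function bodies and all parameter names other than `scenario` and `dynamic_variables` are made up for the illustration; they are not the real `run_conversation` signature.

```python
def run_conversation_before(system_prompt, caller_turns):
    """Hypothetical old signature: no scenario / dynamic_variables."""
    return [{"role": "user", "content": turn} for turn in caller_turns]


def run_conversation_after(system_prompt, caller_turns,
                           scenario=None, dynamic_variables=None):
    """Hypothetical fixed signature: accepts both keyword arguments."""
    return [{"role": "user", "content": turn} for turn in caller_turns]


# The callers pass these keyword arguments, as pipeline.py / researcher.py do.
call_kwargs = {"scenario": {"id": "s1"}, "dynamic_variables": {"name": "Ada"}}

try:
    run_conversation_before("prompt", ["hi"], **call_kwargs)
    mismatch_error = ""
except TypeError as exc:
    mismatch_error = str(exc)  # unexpected keyword argument

transcript = run_conversation_after("prompt", ["hi"], **call_kwargs)
```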

Implements LiveKitClient using LiveKit data messages for text-based
adversarial evals without audio. Includes config integration, pipeline
and researcher wiring, example config, env vars, and a fix for the
run_conversation signature mismatch that caused TypeError when called
from pipeline/researcher with scenario/dynamic_variables kwargs.

Add LiveKit to the supported providers list, setup instructions (env
vars, config copy command), providers table, and project structure.

@ArchishmanSengupta
Owner

Thanks @Mohith1612 for the contribution. I will take a look at the PR shortly.

@ArchishmanSengupta
Owner

Thanks for the LiveKit update @Mohith1612. I validated this PR against a real LiveKit Cloud deployment and wanted to share the results.

What works

  • LiveKit connectivity/authentication is valid.
  • Explicit agent dispatch works: the agent is successfully dispatched and joins the created room (agent-AJ_... participant appears).

What is still broken (blocking end-to-end)

  1. research mode cannot run with a LiveKit-only provider

    • Running python main.py research -c examples/livekit.config.yaml fails immediately with:
      NotImplementedError: No agent_backend configured for LiveKit provider...
    • researcher.run() currently requires prompt read/write, but LiveKitClient only supports that when delegating to smallest.
  2. No data-channel response from the agent

    • After successful dispatch + join, I sent multiple test packets:
      • topics "text", "chat", and "" (empty)
      • JSON payload {"role":"user","content":"..."}
      • plain-text payload
    • In all cases: no response packet received from the agent.

Could you test this against a live agent on LiveKit and confirm that autovoiceevals works end to end after your fixes?

@ArchishmanSengupta ArchishmanSengupta self-requested a review March 31, 2026 23:25
@Mohith1612
Author

Hey @ArchishmanSengupta, thanks for the thorough review and the live deployment test, really helpful.

Issue 1: research mode crash: fixed

Added a LocalPromptBackend class that manages the system prompt locally (in memory or a file on disk) without needing an external API. Setting agent_backend: "local" in config with a system_prompt now lets research and pipeline modes run without hitting NotImplementedError. Both researcher.py and pipeline.py are wired up for it.
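Roughly, the local backend behaves like the sketch below. This is a simplified illustration rather than the code in the PR; the constructor arguments (`system_prompt`, `path`) are assumed names.

```python
from pathlib import Path
from typing import Optional


class LocalPromptBackend:
    """Keeps the system prompt in memory, optionally mirrored to a file."""

    def __init__(self, system_prompt: str = "", path: Optional[str] = None):
        self._path = Path(path) if path else None
        self._prompt = system_prompt
        # If a prompt file already exists, it takes precedence (assumption).
        if self._path is not None and self._path.exists():
            self._prompt = self._path.read_text(encoding="utf-8")

    def get_system_prompt(self) -> str:
        return self._prompt

    def update_prompt(self, prompt: str) -> None:
        self._prompt = prompt
        if self._path is not None:
            self._path.write_text(prompt, encoding="utf-8")
```

With this in place, the research loop can read and rewrite the prompt without any external API call.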

Also added an inject_system_prompt flag. When true, the current prompt is sent as {"role": "system", "content": "..."} as the first data message of each conversation, before any caller turns, so agents can apply prompt changes at runtime. The example config (examples/livekit.config.yaml) is updated to document all new options.
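In sketch form, the injection step amounts to the following. `opening_messages` is a hypothetical helper name, and skipping injection for an empty prompt is an assumption.

```python
import json
from typing import List


def opening_messages(system_prompt: str, inject_system_prompt: bool) -> List[bytes]:
    """Return the data messages to publish before the first caller turn."""
    if not inject_system_prompt or not system_prompt:
        return []
    message = {"role": "system", "content": system_prompt}
    return [json.dumps(message).encode("utf-8")]
```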

Issue 2: no data-channel response: partially addressed

Made the receiver more robust:

  • Replaced the direct put_nowait call with loop.call_soon_threadsafe, using the loop from asyncio.get_running_loop(), for correctness across threading contexts
  • Added topic filtering (ignores packets not on data_topic) and self-send filtering by participant identity
  • Drains stale queue entries before each turn to prevent a late response from one turn being consumed as the reply to the next
  • Added debug-level logging; run with logging.DEBUG to see exactly what arrives on the data channel
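The receiver changes above can be sketched without the LiveKit SDK. The class below is an illustration only: `ReplyReceiver` and its method names are hypothetical, and `on_data` stands in for whatever callback the SDK invokes when a packet arrives.

```python
import asyncio
import logging
from typing import Optional

log = logging.getLogger("livekit_receiver_sketch")


class ReplyReceiver:
    """Collects agent replies; must be constructed inside a running event loop."""

    def __init__(self, data_topic: str, own_identity: str):
        self._loop = asyncio.get_running_loop()
        self._queue: asyncio.Queue = asyncio.Queue()
        self._topic = data_topic
        self._own_identity = own_identity

    def on_data(self, payload: bytes, topic: str, sender_identity: str) -> None:
        """Packet callback; safe to call from any thread."""
        log.debug("packet topic=%r from=%r bytes=%d", topic, sender_identity, len(payload))
        if topic != self._topic:
            return  # topic filtering
        if sender_identity == self._own_identity:
            return  # self-send filtering
        # Hand the payload to the loop thread instead of touching the queue here.
        self._loop.call_soon_threadsafe(self._queue.put_nowait, payload)

    def drain(self) -> int:
        """Drop stale replies before a new turn; returns how many were dropped."""
        dropped = 0
        while True:
            try:
                self._queue.get_nowait()
            except asyncio.QueueEmpty:
                return dropped
            dropped += 1

    async def next_reply(self, timeout: float) -> Optional[bytes]:
        """Wait for the next reply, or return None on timeout."""
        try:
            return await asyncio.wait_for(self._queue.get(), timeout)
        except asyncio.TimeoutError:
            return None
```

Calling `drain()` just before publishing each caller turn is what keeps a late reply to turn N from being read as the reply to turn N+1.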

However, after cross-referencing the latest LiveKit Python SDK docs, the client-side code appears to be correct. The root cause of the missing response is almost certainly that the agent being tested doesn't implement a data channel listener at all. The protocol requires the agent to explicitly:

  1. Subscribe to data_received
  2. Parse {"role": "user", "content": "..."} packets
  3. Publish back {"role": "assistant", "content": "..."} on the same topic

If the agent is audio-only (which is the default for most LiveKit agents built with livekit-agents), it will join the room successfully but silently ignore data channel messages. There's nothing the caller side can do to force a response; this needs to be implemented in the agent itself.
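For reference, the agent-side logic for the three steps above can be as small as the sketch below. It deliberately leaves out the SDK wiring: registering it on the room's data_received event and publishing the returned bytes back on the same topic depend on the agent's SDK version. `handle_data_packet` and `generate_reply` are hypothetical names.

```python
import json
from typing import Callable, Optional


def handle_data_packet(
    payload: bytes,
    topic: str,
    data_topic: str,
    generate_reply: Callable[[str], str],
) -> Optional[bytes]:
    """Turn one incoming user packet into the reply bytes to publish.

    Returns None when the packet should be ignored.
    """
    if topic != data_topic:
        return None
    try:
        message = json.loads(payload.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    if not isinstance(message, dict) or message.get("role") != "user":
        return None
    content = message.get("content")
    if not isinstance(content, str):
        return None
    # generate_reply stands in for the agent's own text pipeline (assumption).
    reply = generate_reply(content)
    return json.dumps({"role": "assistant", "content": reply}).encode("utf-8")
```

The agent would call this from its data-received callback and publish any non-None result on `data_topic`.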
